After taking the CE course, participants will:
Get familiar with deep learning methods such as feedforward neural networks, CNNs, and RNNs, with hands-on practice applying them through the R keras package with a TensorFlow backend.
Understand data science in general and the end-to-end data science project cycle.
Get familiar with cloud-based big data platforms (i.e., Databricks Spark) for data preprocessing and model development that are widely used in industry for development and production, and learn how to transition quickly from an academic environment to an enterprise environment.
Learn soft skills to ensure the successful delivery of data science projects and get familiar with typical data science project pitfalls.
Data science is the discipline of making data useful. Ok…so what is it?
Engineering: the process of making everything else possible
Analysis: the process of quickly turning raw information into insights
Modeling: the process of diving deeper into the data to discover patterns we don’t easily see
(This is group work from https://github.com/brohrer/academic_advisory/blob/master/authors.md !)
Data environment: data storage, Kafka platform, Hadoop and Spark clusters, etc.
Data management: parsing the logs, web scraping, API queries, and interrogating data streams.
Production: integrating models and analyses into the production system
Domain knowledge
Exploratory analysis
Storytelling
Supervised learning
Unsupervised learning
Customized model development
Some of the deep learning slides are based on Andrew Ng’s course, Deep Learning Specialization. Super awesome!